Automatic Speech Recognition System for Real Time Applications

Authors

S. Preethi

B. Arivu Selvam

Abstract

Speech is the most efficient mode of human communication, and it is also the one people find most comfortable and familiar. Other input modalities demand more concentration, restrict movement, and can cause physical strain from unnatural postures. These limitations motivate the development of speech-to-text conversion. In spoken language, syllables are often regarded as the phonological "building blocks" of words. Depending on the language and the sounds involved, a phoneme may be written consistently with a single letter, although there are many exceptions to this rule. The range of possible applications is wide and includes voice-controlled appliances, fully featured speech-to-text conversion, automation of operator-assisted services, and voice recognition aids for people with disabilities. This project implements the removal of additive noise and the conversion of speech to text. Spectral subtraction is used to remove the noise present in the speech signal. The signal is then segmented with the help of a group delay algorithm. Recognition plays a major role in speech-to-text conversion: letter recognition is achieved with a simple Gaussian Mixture Model (GMM), while word recognition, a more challenging problem for researchers, is performed with a Hidden Markov Model (HMM) for greater accuracy. The main application of the framework is hands-free data entry. Mobile and medical environments can also make use of this speech-to-text conversion.
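The abstract names spectral subtraction as the noise-removal step but gives no implementation details. The following is a minimal sketch of one common variant (magnitude spectral subtraction with overlap-add resynthesis), not the authors' implementation. The function name `spectral_subtraction`, the frame length, the number of noise-only frames, and the spectral floor are all assumptions chosen for illustration, as is the premise that the recording begins with a short speech-free segment.

```python
import numpy as np


def spectral_subtraction(noisy, frame_len=256, noise_frames=10, floor=0.02):
    """Magnitude spectral subtraction (illustrative sketch).

    Assumption: the first `noise_frames` frames contain noise only, so their
    average magnitude spectrum can serve as the noise estimate.
    """
    noisy = np.asarray(noisy, dtype=float)
    if len(noisy) < frame_len:
        raise ValueError("signal is shorter than one frame")

    hop = frame_len // 2
    # Periodic Hann window: copies shifted by half a frame sum to exactly 1,
    # so plain overlap-add reconstructs the interior of the signal.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(noisy) - frame_len) // hop

    # Short-time spectra of the windowed frames.
    frames = np.stack(
        [noisy[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    spectra = np.fft.rfft(frames, axis=1)
    magnitude = np.abs(spectra)
    phase = np.angle(spectra)

    # Noise estimate: mean magnitude over the leading (assumed silent) frames.
    noise_mag = magnitude[:noise_frames].mean(axis=0)

    # Subtract the estimate and keep a small spectral floor so that no bin
    # becomes negative.
    cleaned_mag = np.maximum(magnitude - noise_mag, floor * magnitude)

    # Resynthesize with the noisy phase and overlap-add the frames.
    cleaned = np.fft.irfft(cleaned_mag * np.exp(1j * phase), n=frame_len, axis=1)
    output = np.zeros(len(noisy))
    for i in range(n_frames):
        output[i * hop:i * hop + frame_len] += cleaned[i]
    return output
```

The noisy phase is reused because the subtraction only modifies magnitudes. The first and last half frame of the output are tapered by the window, and any samples beyond the last full frame are left at zero, which a production system would handle by padding the input.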

Similar resources

Design and implementation of a real-time system for detecting and recognizing vehicle license plates in video images

Automatic Number Plate Recognition (ANPR) is a popular topic in the field of image processing and has been studied from different aspects since the early 1990s. There are many challenges in this field, including fast-moving vehicles, different viewing angles and distances from the camera, complex and unpredictable backgrounds, poor-quality images, the existence of multiple plates in the scene, va...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods

For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, the design and production of speech recognition systems have attracted the attention of researchers. Among these, lip-reading techniques face many challenges in speech recognition, one of the challenges b...

Improving the performance of MFCC for Persian robust speech recognition

The Mel frequency cepstral coefficients (MFCC) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

Publication date: 2013
